concentration rate
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior
Chan, Tsai Hor, Zhang, Dora Yan, Yin, Guosheng, Yu, Lequan
Bayesian neural networks (BNNs) treat neural network weights as random variables, which aim to provide posterior uncertainty estimates and avoid overfitting by performing inference on the posterior weights. However, the selection of appropriate prior distributions remains a challenging task, and BNNs may suffer from catastrophic inflated variance or poor predictive performance when poor choices are made for the priors. Existing BNN designs apply different priors to weights, while the behaviours of these priors make it difficult to sufficiently shrink noisy signals or they are prone to overshrinking important signals in the weights. To alleviate this problem, we propose a novel R2D2-Net, which imposes the R^2-induced Dirichlet Decomposition (R2D2) prior to the BNN weights. The R2D2-Net can effectively shrink irrelevant coefficients towards zero, while preventing key features from over-shrinkage. To approximate the posterior distribution of weights more accurately, we further propose a variational Gibbs inference algorithm that combines the Gibbs updating procedure and gradient-based optimization. This strategy enhances stability and consistency in estimation when the variational objective involving the shrinkage parameters is non-convex. We also analyze the evidence lower bound (ELBO) and the posterior concentration rates from a theoretical perspective. Experiments on both natural and medical image classification and uncertainty estimation tasks demonstrate satisfactory performance of our method.
- Asia > China > Hong Kong (0.05)
- North America > United States > North Carolina (0.04)
- North America > United States > California > Monterey County > Monterey (0.04)
- (2 more...)
- Research Report (0.64)
- Instructional Material > Course Syllabus & Notes (0.34)
- Education > Educational Setting > Higher Education (0.46)
- Health & Medicine > Diagnostic Medicine > Imaging (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
STONet: A novel neural operator for modeling solute transport in micro-cracked reservoirs
Haghighat, Ehsan, Adeli, Mohammad Hesan, Mousavi, S Mohammad, Juanes, Ruben
In this work, we develop a novel neural operator, the Solute Transport Operator Network (STONet), to efficiently model contaminant transport in micro-cracked reservoirs. The model combines different networks to encode heterogeneous properties effectively. By predicting the concentration rate, we are able to accurately model the transport process. Numerical experiments demonstrate that our neural operator approach achieves accuracy comparable to that of the finite element method. The previously introduced Enriched DeepONet architecture has been revised, motivated by the architecture of the popular multi-head attention of transformers, to improve its performance without increasing the compute cost. The computational efficiency of the proposed model enables rapid and accurate predictions of solute transport, facilitating the optimization of reservoir management strategies and the assessment of environmental impacts. The data and code for the paper will be published at https://github.com/ehsanhaghighat/STONet.
Concentration of Cumulative Reward in Markov Decision Processes
Sayedana, Borna, Caines, Peter E., Mahajan, Aditya
In this paper, we investigate the concentration properties of cumulative rewards in Markov Decision Processes (MDPs), focusing on both asymptotic and non-asymptotic settings. We introduce a unified approach to characterize reward concentration in MDPs, covering both infinite-horizon settings (i.e., average and discounted reward frameworks) and finite-horizon setting. Our asymptotic results include the law of large numbers, the central limit theorem, and the law of iterated logarithms, while our non-asymptotic bounds include Azuma-Hoeffding-type inequalities and a non-asymptotic version of the law of iterated logarithms. Additionally, we explore two key implications of our results. First, we analyze the sample path behavior of the difference in rewards between any two stationary policies. Second, we show that two alternative definitions of regret for learning policies proposed in the literature are rate-equivalent. Our proof techniques rely on a novel martingale decomposition of cumulative rewards, properties of the solution to the policy evaluation fixed-point equation, and both asymptotic and non-asymptotic concentration results for martingale difference sequences.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- (5 more...)
Convergence rates of a partition based Bayesian multivariate density estimation method
Linxi Liu, Dangna Li, Wing Hung Wong
We study a class of non-parametric density estimators under Bayesian settings. The estimators are obtained by adaptively partitioning the sample space. Under a suitable prior, we analyze the concentration rate of the posterior distribution, and demonstrate that the rate does not directly depend on the dimension of the problem in several special cases. Another advantage of this class of Bayesian density estimators is that it can adapt to the unknown smoothness of the true density function, thus achieving the optimal convergence rate without artificial conditions on the density.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Adaptive posterior concentration rates for sparse high-dimensional linear regression with random design and unknown error variance
This paper investigates sparse high-dimensional linear regression, particularly examining the properties of the posterior under conditions of random design and unknown error variance. We provide consistency results for the posterior and analyze its concentration rates, demonstrating adaptiveness to the unknown sparsity level of the regression coefficient vector. Furthermore, we extend our investigation to establish concentration outcomes for parameter estimation using specific distance measures. These findings are in line with recent discoveries in frequentist studies. Additionally, by employing techniques to address model misspecification through a fractional posterior, we broaden our analysis through oracle inequalities to encompass the critical aspect of model misspecification for the regular posterior. Our novel findings are demonstrated using two different types of sparsity priors: a shrinkage prior and a spike-and-slab prior.
- North America > United States > New York (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Posterior concentrations of fully-connected Bayesian neural networks with general priors on the weights
Bayesian approaches for training deep neural networks (BNNs) have received significant interest and have been effectively utilized in a wide range of applications. There have been several studies on the properties of posterior concentrations of BNNs. However, most of these studies only demonstrate results in BNN models with sparse or heavy-tailed priors. Surprisingly, no theoretical results currently exist for BNNs using Gaussian priors, which are the most commonly used one. The lack of theory arises from the absence of approximation results of Deep Neural Networks (DNNs) that are non-sparse and have bounded parameters. In this paper, we present a new approximation theory for non-sparse DNNs with bounded parameters. Additionally, based on the approximation theory, we show that BNNs with non-sparse general priors can achieve near-minimax optimal posterior concentration rates to the true model.
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Masked Bayesian Neural Networks : Theoretical Guarantee and its Posterior Inference
Kong, Insung, Yang, Dongyoon, Lee, Jongjin, Ohn, Ilsang, Baek, Gyuseung, Kim, Yongdai
Bayesian approaches for learning deep neural networks (BNN) have been received much attention and successfully applied to various applications. Particularly, BNNs have the merit of having better generalization ability as well as better uncertainty quantification. For the success of BNN, search an appropriate architecture of the neural networks is an important task, and various algorithms to find good sparse neural networks have been proposed. In this paper, we propose a new node-sparse BNN model which has good theoretical properties and is computationally feasible. We prove that the posterior concentration rate to the true model is near minimax optimal and adaptive to the smoothness of the true model. In particular the adaptiveness is the first of its kind for node-sparse BNNs. In addition, we develop a novel MCMC algorithm which makes the Bayesian inference of the node-sparse BNN model feasible in practice.
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Variational approximations of empirical Bayes posteriors in high-dimensional linear models
In high-dimensions, the prior tails can have a significant effect on both posterior computation and asymptotic concentration rates. To achieve optimal rates while keeping the posterior computations relatively simple, an empirical Bayes approach has recently been proposed, featuring thin-tailed conjugate priors with data-driven centers. While conjugate priors ease some of the computational burden, Markov chain Monte Carlo methods are still needed, which can be expensive when dimension is high. In this paper, we develop a variational approximation to the empirical Bayes posterior that is fast to compute and retains the optimal concentration rate properties of the original. In simulations, our method is shown to have superior performance compared to existing variational approximations in the literature across a wide range of high-dimensional settings.
- North America > United States > North Carolina (0.04)
- Asia > Middle East > Jordan (0.04)
Convergence rates of a partition based Bayesian multivariate density estimation method
Liu, Linxi, Li, Dangna, Wong, Wing Hung
We study a class of non-parametric density estimators under Bayesian settings. The estimators are obtained by adaptively partitioning the sample space. Under a suitable prior, we analyze the concentration rate of the posterior distribution, and demonstrate that the rate does not directly depend on the dimension of the problem in several special cases. Another advantage of this class of Bayesian density estimators is that it can adapt to the unknown smoothness of the true density function, thus achieving the optimal convergence rate without artificial conditions on the density. We also validate the theoretical results on a variety of simulated data sets.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)